introduction to rmarkdown

introduction

markdown .md

stripped down html


  • intended to be as easy-to-read and easy-to-write as possible.
  • intended for one purpose: to be used as a format for writing for the web.
  • syntax is very small, corresponding only to a very small subset of HTML tags.
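
The mapping from markdown syntax to HTML tags can be seen directly from R. This is a minimal sketch, assuming the `commonmark` package is installed (it is not part of the original slides); the sample text is made up for illustration.

```r
library(commonmark)

# A few lines of markdown: a header, inline emphasis and a list
md <- c(
  "# A header",
  "",
  "Some *italic* and **bold** text.",
  "",
  "- first item",
  "- second item"
)

# Convert the markdown source to HTML
html <- markdown_html(paste(md, collapse = "\n"))

# Each piece of markdown syntax becomes one HTML tag:
# header -> <h1>, *...* -> <em>, **...** -> <strong>, list -> <ul>/<li>
cat(html)
```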


focus on communicating & disseminating


  • formatting handled automatically
  • clean and legible across platforms and outputs

structure of a website

rmarkdown .Rmd

rmarkdown integrates:

– a documentation language (.md)

with:

– a programming language (R)


enables literate programming

single document to integrate data analysis with textual representations, linking data, code, and text


outputs


Rmarkdown & reproducibility


Computational science has led to exciting new developments:

  • Technology is increasing data collection throughput; data are more complex and high-dimensional
  • Existing databases can be merged to become bigger databases
  • Computing power allows more sophisticated analyses, even on “small” data
  • For every field “X” there is a “Computational X”


The increasing computational complexity of analyses has exposed limitations in our ability to evaluate published findings:

  • Even basic analyses are difficult to describe

  • Errors more easily introduced into long analysis pipelines

  • Knowledge transfer is inhibited

  • Results are difficult to replicate or reproduce

  • Complicated analyses cannot be trusted



calls for reproducibility


Reproducibility has the potential to serve as a minimum standard for judging scientific claims when full independent replication of a study is not possible.

  • fully scripted analyses
  • make code and data available
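
As a rough illustration of what "fully scripted" can mean in practice, the sketch below runs a small analysis end to end with no manual steps. The model, the bootstrap, and the output file names are assumptions chosen for illustration; `tempdir()` stands in for a real project folder.

```r
# Every step from data to result is code, so the analysis can be rerun exactly.
set.seed(2024)  # fix the random seed so the resampling below is repeatable

data(airquality)  # built-in dataset, standing in for "your data"
clean <- airquality[complete.cases(airquality[, c("Ozone", "Temp")]), ]

# Fit a simple model: ozone as a function of temperature
fit <- lm(Ozone ~ Temp, data = clean)

# Bootstrap the slope to get a resampling-based standard error
boot_slopes <- replicate(200, {
  idx <- sample(nrow(clean), replace = TRUE)
  coef(lm(Ozone ~ Temp, data = clean[idx, ]))[["Temp"]]
})

results <- data.frame(
  term     = "Temp",
  estimate = coef(fit)[["Temp"]],
  boot_se  = sd(boot_slopes)
)

# Save the results together with a record of the software environment
out_dir <- tempdir()  # assumption: replace with a folder in your project
write.csv(results, file.path(out_dir, "results.csv"), row.names = FALSE)
writeLines(capture.output(sessionInfo()), file.path(out_dir, "session_info.txt"))
```

Sharing the script, the data, and the two output files gives a reader what they need to rerun the analysis and compare results.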


a reproducible workflow


reproducibility limitations

  • top down
  • downstream (post publication)
  • ultimately does not address the key question:

    can we trust these results?


evidence based science

evidence needs:

  • documenting
  • linking
  • communicating


rmarkdown can integrate tools, processes and outputs into evidence streams that are easily shareable

at all stages of the scientific process


simple tools:

low hanging fruit

  • begin at the start of the process
  • document & interlink evidence streams
  • explore and communicate!

empower your code and data


Science and the web

why sharing is important

To help solve these problems, we make a number of suggestions including providing blog posts or videos to explain new methods in less technical terms, encouraging reproducibility and code sharing, making wiki-style pages summarising the literature on popular methods, more careful consideration and testing of whether a method is appropriate for a given question/data set, increased collaboration, and a shift from publishing purely novel methods to publishing improvements to existing methods and ways of detecting biases or testing model fit. Many of these points are applicable across methods in ecology and evolution, not just phylogenetic comparative methods.


Let’s go have a look

Open your first .Rmd!!




Elements of .Rmd


YAML header

outputs
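
The sketch below shows how the YAML header and the output format fit together. It writes a tiny .Rmd file from R, reads the header back, and renders the file. The file name and title are hypothetical, and rendering is skipped when pandoc is not available.

```r
library(rmarkdown)

fence <- strrep("`", 3)  # three backticks, the chunk delimiter in .Rmd files

# Hypothetical example file, written to a temporary folder
rmd_file <- file.path(tempdir(), "example.Rmd")
writeLines(c(
  "---",                          # the YAML header sits between two --- lines
  'title: "My first report"',
  "output: html_document",        # default output format for this document
  "---",
  "",
  "Mean ozone is computed below.",
  "",
  paste0(fence, "{r}"),
  "mean(airquality$Ozone, na.rm = TRUE)",
  fence
), rmd_file)

# Read the header back as an R list
front <- yaml_front_matter(rmd_file)
front$output

# Render using the format in the header, then override it for a Word document
if (pandoc_available()) {
  html_path <- render(rmd_file, quiet = TRUE)
  docx_path <- render(rmd_file, output_format = "word_document", quiet = TRUE)
}
```

The same source file produces both outputs; only the output format changes.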

md basics


text

    normal text

normal text

    *italic text*

italic text

    **bold text**

bold text

    ***bold italic text***

bold italic text

    superscript^2^

superscript²

    ~~strikethrough~~

strikethrough


headers

unordered lists

ordered lists

quotes & code

    > this text will be quoted

this text will be quoted

    `this text will appear as code` inline

this text will appear as code inline

    a <- 10

    the value of parameter *a* is `r a`

the value of parameter a is 10
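
To see inline evaluation happen, a small document can be knitted directly from a character vector. This is a minimal sketch using `knitr::knit(text = ...)`; the backticks are built with `strrep()` only to keep them out of the example's own code fence.

```r
library(knitr)

fence <- strrep("`", 3)  # chunk delimiter
tick  <- "`"             # inline code delimiter

# A chunk that defines a, followed by a sentence that uses it inline
src <- c(
  paste0(fence, "{r}"),
  "a <- 10",
  fence,
  "",
  paste0("the value of parameter *a* is ", tick, "r a", tick)
)

# knit() runs the chunk and replaces the inline expression with its value
out <- knit(text = src, quiet = TRUE)
cat(out)
```

If the value assigned to `a` changes, the sentence updates the next time the document is knitted.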


images

    ![](https://www.rstudio.com/wp-content/uploads/2015/01/rmarkdown-cheatsheet-2-e1457627578814.png)
    
    ![](assets/cheat.png)
    

resize images

    <img src="assets/cheat.png" width="200px" />

basic tables

    Table Header  | Second Header
    ------------- | -------------
    Cell 1        | Cell 2
    Cell 3        | Cell 4

Table Header  | Second Header
------------- | -------------
Cell 1        | Cell 2
Cell 3        | Cell 4

online .md table converter


chunks

R code chunks can be used to render R output into documents, or simply to display code for illustration

options

for more details see http://yihui.name/knitr/
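
As a minimal sketch of how chunk options change what ends up in the document, the code below knits the same one-line chunk under three settings. `knit_chunk()` is a hypothetical helper written for this example, not a knitr function.

```r
library(knitr)

fence <- strrep("`", 3)  # chunk delimiter

# Hypothetical helper: knit a chunk containing `1 + 1` with the given options
knit_chunk <- function(options = "") {
  header <- if (nzchar(options)) paste0("{r ", options, "}") else "{r}"
  src <- c(paste0(fence, header), "1 + 1", fence)
  knit(text = src, quiet = TRUE)
}

default_out <- knit_chunk()               # shows the code and its result
no_echo_out <- knit_chunk("echo = FALSE") # hides the code, shows the result
no_eval_out <- knit_chunk("eval = FALSE") # shows the code, does not run it

cat(default_out, no_echo_out, no_eval_out, sep = "\n\n")
```

`echo` and `eval` are two of the most commonly used options; the knitr documentation linked above lists the rest.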

set default options

# show code in the output, but hide warnings and messages, for all chunks
knitr::opts_chunk$set(echo = TRUE, warning = FALSE, message = FALSE)


extras

knitr::kable() tables

library(knitr)
data(airquality)
# print the first six rows as a formatted table with a caption
kable(head(airquality), caption = "New York Air Quality Measurements")
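
New York Air Quality Measurements

Ozone | Solar.R | Wind | Temp | Month | Day
----- | ------- | ---- | ---- | ----- | ---
41    | 190     | 7.4  | 67   | 5     | 1
36    | 118     | 8.0  | 72   | 5     | 2
12    | 149     | 12.6 | 74   | 5     | 3
18    | 313     | 11.5 | 62   | 5     | 4
NA    | NA      | 14.3 | 56   | 5     | 5
28    | NA      | 14.9 | 66   | 5     | 6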


DT::datatable() tables

library(DT)
data(airquality)
# interactive table of the full dataset (searchable, sortable, paginated)
datatable(airquality, caption = "New York Air Quality Measurements")


plotly

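library(plotly)  # also attaches ggplot2, which provides the diamonds data

# take a reproducible random sample of 1000 diamonds
set.seed(100)
d <- diamonds[sample(nrow(diamonds), 1000), ]

# build a static ggplot: price against carat, one panel per cut
# (the text aesthetic is not used by ggplot2 itself; ggplotly() picks it up for tooltips)
p <- ggplot(data = d, aes(x = carat, y = price)) +
  geom_point(aes(text = paste("Clarity:", clarity)), size = 1) +
  geom_smooth(aes(colour = cut, fill = cut)) + facet_wrap(~ cut)

# convert the ggplot into an interactive plotly graphic
ggplotly(p)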


Exercise

your mission

create your first .Rmd!

see my example: beavers! html - raw .Rmd